Minimum Common String Partition Problem: Hardness and Approximations

Authors

  • Avraham Goldstein
  • Petr Kolman
  • Jie Zheng
Abstract

String comparison is a fundamental problem in computer science, with applications in areas such as computational biology, text processing, or compression. In this paper we address the minimum common string partition problem, a string comparison problem with a tight connection to the problem of sorting by reversals with duplicates, a key problem in genome rearrangement. A partition of a string A is a sequence P = (P1, P2, ..., Pm) of strings, called the blocks, whose concatenation is equal to A. Given a partition P of a string A and a partition Q of a string B, we say that the pair 〈P,Q〉 is a common partition of A and B if Q is a permutation of P. The minimum common string partition problem (MCSP) is to find a common partition of two strings A and B with the minimum number of blocks. The restricted version of MCSP in which each letter occurs at most k times in each input string is denoted by k-MCSP. In this paper, we show that 2-MCSP (and therefore MCSP) is NP-hard and, moreover, even APX-hard. We describe a 1.1037-approximation for 2-MCSP and a linear time 4-approximation algorithm for 3-MCSP. We are not aware of any better approximations.
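The definitions in the abstract can be made concrete with a small sketch. The Python below is illustrative only and is not the paper's approximation algorithm: it checks the common-partition condition directly and finds the minimum number of blocks by exhaustive search, which takes exponential time and is usable only on very short strings. The function names (`is_common_partition`, `split_at`, `min_common_partition_size`, `smallest_k`) are hypothetical and chosen for this example.

```python
from collections import Counter
from itertools import combinations


def is_common_partition(p, q, a, b):
    """True if p partitions a, q partitions b, and q is a permutation of p."""
    return "".join(p) == a and "".join(q) == b and Counter(p) == Counter(q)


def split_at(s, cuts):
    """Split s into consecutive blocks at the given increasing cut positions."""
    bounds = [0, *cuts, len(s)]
    return [s[i:j] for i, j in zip(bounds, bounds[1:])]


def min_common_partition_size(a, b):
    """Brute-force minimum number of blocks (assumption: tiny inputs only).

    Returns None when no common partition exists, i.e. when the two strings
    do not contain the same letters with the same multiplicities.
    """
    if Counter(a) != Counter(b):
        return None
    n = len(a)
    if n == 0:
        return 0
    for m in range(1, n + 1):  # try block counts in increasing order
        # All multisets of blocks obtainable by cutting b into m blocks.
        multisets_b = {
            tuple(sorted(split_at(b, cuts)))
            for cuts in combinations(range(1, n), m - 1)
        }
        for cuts in combinations(range(1, n), m - 1):
            if tuple(sorted(split_at(a, cuts))) in multisets_b:
                return m
    return None


def smallest_k(a, b):
    """Smallest k such that (a, b) is a k-MCSP instance."""
    return max(
        max(Counter(a).values(), default=0),
        max(Counter(b).values(), default=0),
    )
```

For example, with A = "abcab" and B = "ababc", the pair P = (abc, ab), Q = (ab, abc) is a common partition with two blocks, and since A differs from B no single-block common partition exists. Each letter occurs at most twice in either string, so this is a 2-MCSP instance.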


Related resources

Minimum Common String Partition Parameterized by Partition Size Is Fixed-Parameter Tractable

The NP-hard Minimum Common String Partition problem asks whether two strings x and y can each be partitioned into at most k substrings such that both partitions use exactly the same substrings in a different order. We present the first fixed-parameter algorithm for Minimum Common String Partition using only parameter k.


Construct, Merge, Solve and Adapt: Application to Unbalanced Minimum Common String Partition

In this paper we present the application of a recently proposed, general, algorithm for combinatorial optimization to the unbalanced minimum common string partition problem. The algorithm, which is labelled Construct, Merge, Solve & Adapt, works on subinstances of the tackled problem instances. At each iteration, the incumbent sub-instance is modified by adding solution components found in prob...


An Integer Programming Formulation of the Minimum Common String Partition Problem

We consider the problem of finding a minimum common string partition (MCSP) of two strings, which is an NP-hard problem. The MCSP problem is closely related to genome comparison and rearrangement, an important field in Computational Biology. In this paper, we map the MCSP problem into a graph applying a prior technique and using this graph, we develop an Integer Linear Programming (ILP) formula...


Computational Performance Evaluation of Two Integer Linear Programming Models for the Minimum Common String Partition Problem

In the minimum common string partition (MCSP) problem two related input strings are given. “Related” refers to the property that both strings consist of the same set of letters appearing the same number of times in each of the two strings. The MCSP seeks a minimum cardinality partitioning of one string into non-overlapping substrings that is also a valid partitioning for the second string. This...


Minimum Common String Partition Revisited

Minimum Common String Partition (MCSP) has drawn much attention due to its application in genome rearrangement. In this paper, we investigate three variants of MCSP: MCSP^c, which requires that there are at most c elements in the alphabet; d-MCSP, which requires the occurrence of each element to be bounded by d; and x-balanced MCSP, which requires the length of blocks being in range (n/k − x, n/...



Journal:
  • Electr. J. Comb.

Volume 12

Publication date: 2004